# Large model compression
## Qwen3 32B FP8 Dynamic
- License: Apache-2.0
- An efficient language model based on Qwen3-32B with FP8 dynamic quantization, significantly reducing memory requirements and improving computational efficiency.
- Tags: Large Language Model, Transformers
- Author: RedHatAI
- 917 · 8
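FP8 *dynamic* quantization computes the scale from each tensor at runtime rather than from an offline calibration set. A minimal numpy sketch of the idea, using simplified E4M3 rounding (subnormals and exponent clipping are ignored; this is an illustration, not the kernel the model actually ships with):

```python
import numpy as np

def fp8_e4m3_dynamic_quant(x):
    """Simulate FP8 (E4M3) dynamic quantization: the per-tensor scale is
    derived from the live tensor's max-abs value at runtime (simplified
    sketch: subnormal values are not modeled)."""
    scale = np.max(np.abs(x)) / 448.0        # 448 = largest E4M3 value
    xs = x / scale
    m, e = np.frexp(xs)                      # xs = m * 2**e, |m| in [0.5, 1)
    q = np.ldexp(np.round(m * 16) / 16, e)   # keep 3 mantissa bits
    return q * scale                         # dequantize back to float

x = np.random.randn(4096).astype(np.float32)
xq = fp8_e4m3_dynamic_quant(x)
rel_err = np.abs(xq - x).max() / np.abs(x).max()
```

Because the scale comes from the tensor itself, dynamic quantization needs no calibration data, at the cost of a max-abs reduction over every quantized tensor.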
## Llama 3.1 Nemotron 70B Instruct HF GGUF
- A model fine-tuned from Meta Llama-3.1-70B-Instruct, optimized with NVIDIA's HelpSteer2 dataset and supporting text-generation tasks.
- Tags: Large Language Model, English
- Author: Mungert
- 1,434 · 3
## Qwq 32B FP8 Dynamic
- License: MIT
- FP8-quantized version of QwQ-32B, reducing storage and memory requirements by 50% through dynamic quantization while maintaining 99.75% of the original model's accuracy.
- Tags: Large Language Model, Transformers
- Author: RedHatAI
- 3,107 · 8
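The 50% figure follows directly from the per-parameter storage cost: BF16/FP16 weights take 2 bytes each, FP8 weights take 1 byte, and the per-tensor scales add negligible overhead. For a 32B-parameter model:

```python
params = 32e9                   # 32 billion weights
bf16_gb = params * 2 / 1e9      # 2 bytes per weight -> 64 GB
fp8_gb = params * 1 / 1e9       # 1 byte per weight  -> 32 GB
saving = 1 - fp8_gb / bf16_gb   # fraction of weight memory saved
```

Activations, KV cache, and runtime buffers sit on top of this, so the end-to-end memory reduction observed in serving is close to, but not exactly, 50%.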
## Molmo 7B O Bnb 4bit
- License: Apache-2.0
- A 4-bit quantized version of Molmo-7B-O, significantly reducing memory requirements and suitable for resource-constrained environments.
- Tags: Large Language Model, Transformers
- Author: cyan2k
- 2,467 · 11
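The "Bnb 4bit" in the name refers to bitsandbytes-style 4-bit quantization, which stores weights block-wise with one scale per block. A simplified numpy sketch of the structure (real bitsandbytes uses an NF4 codebook and fused CUDA kernels; plain absmax int4 rounding is substituted here to keep the example self-contained):

```python
import numpy as np

def quant4_blockwise(x, block=64):
    """Simplified 4-bit absmax block-wise quantization: each block of 64
    weights shares one float scale; values are rounded to ints in [-7, 7]."""
    xb = x.reshape(-1, block)
    scale = np.abs(xb).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(xb / scale), -7, 7).astype(np.int8)
    return q, scale

def dequant4(q, scale):
    """Reconstruct float weights from 4-bit codes and per-block scales."""
    return (q * scale).reshape(-1)

x = np.random.randn(1024).astype(np.float32)
q, s = quant4_blockwise(x)
xr = dequant4(q, s)
```

At 4 bits per weight plus one fp32 scale per 64-weight block, storage works out to about 4.5 bits per weight, roughly a 7x reduction from fp32, which is what makes a 7B multimodal model fit on small GPUs.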